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Primary biliary cirrhosis (PBC) is an uncommon autoimmune disease with a homoge- 
neous clinical phenotype that reflects incomplete disease concordance in monozygotic 
(MZ) twins. We have taken advantage of a unique collection consisting of genomic DNA 
and mRNAfrom peripheral blood cells of female MZ twins (n = 3 sets) and sisters of similar 
age (n = 8 pairs) discordant for disease. We performed a genome-wide study to investigate 
differences in (i) DNA methylation (using a custom tiled four-plex array containing tiled 50- 
mers 19,084 randomly chosen methylation sites), (ii) copy number variation (CNV) (with a 
chip including markers derived from the 1000 Genomes Project, all three HapMap phases, 
and recently published studies), and/or (iii) gene expression (by whole-genome expression 
arrays). Based on the results obtained from these three approaches we utilized quanti- 
tative PCR to compare the expression of candidate genes. Importantly, our data support 
consistent differences in discordant twins and siblings for the (i) methylation profiles of 60 
gene regions, (ii) CNV of 10 genes, and (iii) the expression of 2 interferon-dependent genes. 
Quantitative PCR analysis showed that 17 of these genes are differentially expressed in 
discordant sibling pairs. In conclusion, we report that MZ twins and sisters discordant for 
PBC manifest particular epigenetic differences and highlight the value of the epigenetic 
study of twins. 
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INTRODUCTION 

Primary biliary cirrhosis (PBC) is a female-predominant autoim- 
mune liver disease affecting the small interlobular bile ducts, 
ultimately leading to periportal fibrosis and cirrhosis (1). Sim- 
ilar to most autoimmune diseases, PBC onset results from the 
interplay of genomic predisposition and environmental factors 
(2-5) with a possible role for sex factors (6). Recent genome-wide 
association studies (GWAS) have reported consistent associations 
with polymorphisms of genes such as IL12RA and HLA class II 
in subgroups of patients with PBC (7-13) and a pathway analysis 
was recently performed (13). PBC concordance rates in dizygotic 



(DZ) and monozygotic (MZ) twins are significantly different being 
0 and 63%, respectively, thus supporting the role of both genetic 
and environmental factors (14) with the latter supported also by 
epidemiology (15, 16). 

Promoter methylation influences gene expression (GEX) and 
our group recently reported differences in the DNA methylation 
and expression of two X-linked genes (PIN4 and CLIC2) in MZ 
twins discordant for PBC (17). On the other hand, copy num- 
ber variations (CNV) are the result of duplications and other 
rearrangements (18) occur de novo at much higher rates than 
single nucleotide variants, and may regulate GEX (19). While 
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sharing their genomic sequence, MZ twins may develop different 
phenotypes over the years because of increasing differences in 
DNA methylation (20) and CNV (21, 22). 

We have taken advantage of a unique DNA collection of iden- 
tical and non-identical twins with PBC and performed a genome- 
wide investigation to determine differences in DNA methylation, 
CNV, and GEX. Our data identify 17 candidate genes that are 
significantly under- or up-regulated in affected individuals and 
we suggest that these might constitute new candidates as disease 
markers of genetic determinants. The value of this approach is 
highlighted and suggests the need for the study of a large number 
of patients and cell subpopulations (23) to support this thesis. 

MATERIALS AND METHODS 
SUBJECTS 

Blood samples from three MZ twins pairs discordant for PBC 
whose zygosity had been determined using microsatellite analy- 
sis (Ballestar) and eight sister pairs of similar age (within 5 years) 
discordant for PBC studied (Table 1). Serum antimitochondr- 
ial antibodies (AMA) were positive at indirect immunofluores- 
cence in all patients with PBC and none of the healthy twins 
and sisters. In these subjects, PBC was excluded when serum 
AMA was negative and serum alkaline phosphatase was within 
normal range on two different occasions. Genomic DNA was iso- 
lated from peripheral blood mononuclear cells (PBMCs) using the 
QIAamp Blood Midi Kit (Qiagen, Valencia, CA, USA) and stored 
at — 20°C until used. Additional blood samples were obtained 
using Tempus™ Blood RNA Tubes (Applied Biosystems, Fos- 
ter City, CA, USA) that were stored at -20°C until mRNA was 
extracted using the RNeasy Mini Kit (Qiagen, Valencia, CA, USA) 
and then stored at — 80°C. This study was performed in com- 
pliance with the ethical standards of medicine and, following 
approval by the local IRB, informed consents were obtained from 
all patients and controls in accordance with the Declaration of 
Helsinki. 

METHYLATED DNA IMMUNOPRECIPITATION AND METHYLATION 
ANALYSIS 

DNA samples of three MZ twin sets (#1/2, 9/52, and 24/57; see 
Table 1) were sonicated and then immunoprecipitated with a 
monoclonal antibody that specifically recognizes 5-methylcytidine 
(Roche NimbleGen, Madison, WI, USA). DNA fragments were 
converted into PCR-amplifiable OmniPlex™ Library molecules 
flanked by universal primer sites and the library amplified by PCR 
using universal primers and a limited number of cycles. Immuno- 
precipitated and reference DNA were tagged, respectively, with 
cyanine-5 (Cy5) and cyanine-3 (Cy3)-labeled random 9-mers and 
then hybridized by the NimbleGen Array Hybridization Kit (Roche 
NimbleGen, Madison, WI, USA). 

A four-plex array was custom-designed to include 998 X chro- 
mosome and 18,086 randomly selected autosomal chromosome 
promoter sites (Roche NimbleGen, Madison, WI, USA) and sam- 
ples analyzed following the manufacturers protocols. First, Nim- 
blescan software (Roche NimbleGen, Madison, WI, USA) was 
utilized for DNA methylation data analysis using a threshold 
p-value of 0.05 equivalent to 1.31 based on the Gaussian dis- 
tribution of data. Second, exclusive elements corresponding to 



Table 1 | Summary of the patients with PBC and the corresponding 
healthy sibling and twin sisters utilized in the study. 



PBC 


Age 


Serum 


Control # 


Age 


Serum 


case # 


(years) 


AMA 


(twin/sibling) 




AMA 


2 


60 


Pos 


1 (MZtwin) 


60 


Neg 


9 


60 


Pos 


52 (MZ twin) 


60 


Neg 


24 


64 


Pos 


57 (MZ twin) 


64 


Neg 


4 


62 


Pos 


10 (Sister) 


59 


Neg 


5 


55 


Pos 


14 (Sister) 


59 


Neg 


6 


52 


Pos 


11 (Sister) 


55 


Neg 


12 


61 


Pos 


7 (Sister) 


64 


Neg 


13 


70 


Pos 


8 (Sister) 


68 


Neg 


15 


54 


Pos 


16 (Sister) 


57 


Neg 


27 


45 


Pos 


26 (Sister) 


43 


Neg 


34 


41 


Pos 


33 (Sister) 


45 


Neg 


35 


64 


Pos 


50 (Sister) 


60 


Neg 



specific microarray probes were identified in affected and healthy 
subjects and peaks found only in either group were selected 
for further analysis. Third, elements of interest were inserted 
into the UCSC Genome Browser (GRCh36/hgl9) to identify 
corresponding genes. 

COPY NUMBER VARIATION ANALYSIS 

Copy number variation analysis was performed on genomic DNA 
from one MZ twin set (#1/2; see Table 1) using the Infinium R HD 
Assay Super platform (Illumina, San Diego, CA, USA): in partic- 
ular, we utilized the HumanOmnil-Quad BeadChip that includes 
markers derived from the 1000 Genomes Project, all three HapMap 
phases, and recently published studies (7, 9, 24, 25) as well as ade- 
quate tools for quality control, CNV calling, and validation. The 
protocol included the initial DNA preamplification, fragmenta- 
tion, and precipitation. Data obtained from four-plex chips were 
analyzed using iScan and Illumina BeadArray system (Illumina, 
San Diego, CA, USA) followed by the GenomeStudio software 
(Illumina, San Diego, CA, USA). The position of each probe and 
the number of copies for each probe were determined using the 
PennCNV platform based on a hidden Markov model algorithm 
(26). The UCSC Genome Browser was then used to determine the 
genes involved and the number of CNV. 

MICROARRAY GENE EXPRESSION ANALYSIS 

We utilized RNA samples from eight pairs of sisters of similar 
age (Table 1) discordant for PBC. In the first part, we performed 
a whole-genome microarray comparison of transcripts to detect 
consistently up- or down-regulated genes in affected subjects. 
We obtained biotin-labeled cRNA using the Illumina R TotalPrep 
RNA Amplification Kit (Illumina, San Diego, CA, USA) and used 
the whole-genome Gene Expression Direct Hybridization Assay 
(Illumina, San Diego, CA, USA) including 24,500 transcripts. 
Microarrays were scanned using the BeadArray Reader (Illumina 
Inc., San Diego, CA, USA) and data were processed using Bead- 
Studio software (Illumina Inc.). Expression data were quantified 
using a cut-off for significant gene differences of p < 0.05 with a 
twofold difference in expression as described elsewhere (27). 
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RT-PCR EXPRESSION ANALYSIS 

Real-time PCR was utilized to analyze samples prepared from 1 u,g 
total RNA according to high-capacity cDNA reverse transcription 
kit (Applied Biosystems, Foster City, CA, USA) in seven pairs of 
sisters of similar age (#15/16, 5/14, 6/11, 7/12, 8/13, 26/27, 33/34; 
see Table 1). Micro-fluidic real-time quantitative PCR cards were 
customized to include single-plex assays for all candidate genes 
obtained with DNA methylation, CNV, and GEX analyses. Genes 
reported by GWAS studies were also included among the candi- 
dates (7, 9, 24, 25). All samples were analyzed in duplicate, and 
included 94 candidate genes and the 18S and |3-actin housekeep- 
ing genes. Analyses were performed using the ABI Prism 7900HT 
Sequence Detection System (SDS 2.2.2 software, Applied Biosys- 
tems, Foster City, CA, USA). PCR cycle conditions included 50°C 
for 2min, 94.5°C for lOmin, and 40 cycles of 97°C for 30 s fol- 
lowed by 59.7°C for 1 min. The preliminary study of all 10 samples 
defined the maximum allowable cycle threshold (CT) that was 
set at 38 while outliers exceeding this threshold were excluded 
from the statistical analysis and no adjustment of p-value was 
performed. Internal controls for calculating expression levels of 
candidate genes were 18S and ACTB (beta-actin). The analysis 
has been performed with Data Assist version 3 statistical software 
(Applied Biosystems). The software exports data from real-time 
PCR and performs relative quantification analysis. The data assist 
analysis contains: C t data, sample design, assay design, average of 
C t values of replicates, AC t , normalized versus endogenous con- 
trols C t values ± SD and fold change (RQ) files, which displays RQ 
min and RQ max for each sample. /> Value is calculated from AC t 
files. 

A heat map is used to visualize the data and illustrates, for all 
case/control sibling pairs, GEX in red/green color based on A C t 
values using Pearson's correlation. The neutral/middle expression 
was set as the median of all the A C t values from all samples, the 
red indicated an increase with a AC t value below the middle level 
and the green indicated a decrease with AC t above the middle 
level. 

PATHWAY ANALYSIS 

Gene networks were generated through the use of Ingenuity 
Pathways Analysis software 8.0. Edition (Ingenuity Systems, http: 
//www.ingenuity.com). Each gene identified was mapped to its 
corresponding gene object in the Ingenuity Pathways Knowledge 
Base and overlaid onto a global molecular network. The SDS 2.2.2 
software (Applied Biosystems, Foster City, CA, USA) was used to 
determine changes in expression of a target in an experimental 
sample relative to the same target in a reference sample with the 
Student's (--test and p-value <0.05 were considered statistically 
significant. We utilized Data Assist Software version 3 statistical 
software (Applied Biosystems) and Stata 8.0 for Macintosh (Stata 
Corp, College Station, TX, USA) for statistical analyses. 

RESULTS 

DNA METHYLATION 

DNA methylation comparison showed 60 differentially methy- 
lated regions (DMR) in affected compared to the non-affected 
twin (p < 0.05 for each of the three discordant twin pairs). These 
DMR corresponded to 51 genes on the X chromosome and 9 



genes on autosomal chromosomes, listed in Table 2. For each 
DMR, the PBC proband was hypermethylated compared with the 
non-affected twin. 

COPY NUMBER VARIATIONS 

Ten CNV were discordant between affected and the non-affected 
twin in one twin set. The healthy twin had four CNVs that were 
missing in the affected twin and six CNVs were present only in 
the affected twin. The CNVs were found in the following genes: 
RYBP ring 1, YY1 binding protein, HERV-V2 envelope glyco- 
protein ENW2, POTEK P ankirin domain family member K 
pseudogene, THSD7A thrombospondin type 1 domain containing 
7A = KIAA0960, GOLGA8A golgin A8 family member A, BPTF 
bromodomain PHD finger transcription factor, and C17orf58 
open reading frame. Two additional CNV did not correspond to 
known genes. 

MICR0ARRAY GENE EXPRESSION 

Gene expression analysis using the genome-wide microarray 
showed two genes significantly down-regulated in PBC com- 
pared to the healthy sister in each of the eight discordant sis- 
ter pairs. These genes were IFIT1 (interferon-induced protein 
with tetratricopeptide repeats; chromosome 10q23.31) and IFI44L 
(interferon-induced protein 44-like; chromosome lp31.1) and 
both are interferon-induced (28). 

RT-PCR ANALYSIS 

To provide additional support for our initial findings, we used RT- 
PCR to evaluate expression of each of the candidates that emerged 
from the DNA methylation (60), CNV (10), and expression stud- 
ies (2), as well as previously reported GWAS in seven pairs of 
discordant sisters of similar age (Table 1) (7-9, 12, 13, 24, 25). 
Our data assist analysis contained: C t data, sample design, assay 
design, average of C t values of replicates, AC t , normalized ver- 
sus endogenous controls C t values ± SD and fold change (RQ) 
files, which displays RQ min and RQ max for each sample, p- 
Value was calculated from AC t files. Data assist v3.0 software 
was used with results exported from real-time PCR and for rel- 
ative quantification analysis. Graphic result in heat map visualized 
analyzed data (Figure 1). Heat map showed, for all case/control 
sibling pairs, genes expression in red/green color based on AC t 
values using Pearson's correlation. The neutral/middle expression 
was set as the median of all the ACt values from all samples, 
the red indicated an increase with a A C t value below the middle 
level and the green indicated a decrease with AC t value above 
the middle level. The heat map from all samples is represented in 
Figure 1. Among the entire set of candidate genes, we found five 
genes that were underexpressed in at least three of seven sibling 
pairs with FC < 0.5 (CXCR5, HLA-B, IPI44L, IFIT1, SMARCA1) 
and one overexpressed gene in at least three of seven pairs with an 
FC > 2 (716). Additional 11 genes showed a widely variable expres- 
sion profile in each sibling pair {CD80, FAM104B, HLA-DQB1, 
HLA-DRB1, HLA-G, MTCP1, NHS, PIN4, PRPF38A, THSD7A, 
and TNFAIP2) (Table 3; Figure 2). 

PATHWAY ANALYSIS 

Pathway analyses were performed using the 1 7 resulting genes from 
our study and demonstrated that the most representative functions 
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Table 2 | Differentially methylated genes in PBC-discordant MZ twins. 



Gene 


Chr/base pair (bp) a 


Description/function 


Localization 13 


ABCD1 


chrX: 1 52989993-1 52991024 


ATP-binding cassette, sub-family D (ALD), member 1 


PM 


ATP12A 


chr13:25254828-25254890 


ATPase, H+/K+ transporting, non-gastric, alpha polypeptide 


PM 


ATP5A1 


chr18:43678161-43678731 


ATP synthase, H+ transporting, mitochondrial F1 complex, alpha subunit 1, 
cardiac muscle 


C 


DLArJ I 


cnrA. i bzyoy4yo— i bzyyuubo 


B cell receptor-associated protein 31 


b 


dpm 


cnrA. i bz /oUbzy— ioz/o I oyo 


Biglycan 


c 

L 


DDppO 


cnrA. i o4zyyzo i— i b4zyyyzb 


R DP A 1 /D DP A 9 < -.^->>-i + 'iii-ii>-n-i ^.->>"v-!i-J/-%\/ r- i i Ui i >-i i+ O 

bntA i/DnbAz-containing complex, suounit o 


M 
IN 


PPP 
brr 


CnrA.4/4oo4 lo— 4/40004/ 


Complement factor properdin 


c 

L 


PUI CT7 


„u r V' A d A OA d A 7 /1C/1Q/1QEQ 
CnrA.4o4o4D4 /— 4o4o4oDo 


Carbohydrate (A/-acetylg ucosamine 6-0) su fotransferase 7 


b 


L I AU I A, b I Ab f D 


cnrA. looo i Joy i— iboo 14 id i 


Cancer/testis antigen 1A, B 


b 


DDX41 


ch r5 : 1 769439 1 1-1 7694448 1 


DEAD (Asp-Glu-Ala-Asp) box polypeptide 41 


N 


FAM104B 


chrX:55187570-55188140 


Family with sequence similarity 104, member B 


U 


FGD1 


chrX:54521696-54522266 


FYVE, RhoGER and PH domain containing 1 


C 


FUNDC2 


chrX: 1 54255133-1 54255703 


FUN14 domain containing 2 


C 


GAGE12B, 121, 

9A £ 7 Q 
ZA, 0, /, O 


chrX:4931 5376-4931 5946 


G antigen 1 , 5, 7 


u 


b 1 rbrb 


cnrA.zoUboo— zo Izbb 


GTP-binding protein 6 (putative) 


1 1 

U 


uppc 
MUto 


cnrA. 1 1 izyozD— 1 1 izyboo 


Holocytochrome c synthase 


b 


UPtYPi/1 


^hr-9 ■177n1fi71R 177n171K7 

cnrz. i / /u lb/ 1 b— I / /u I / I b/ 


Homeobox D4 


M 


lUnou 


cnrA. i oouoy /4Z— i d ouoyy44 


Isocitrate dehydrogenase 3 (NAD-I-) gamma 


f 
b 


i nc 
1 Uo 


cnrA. I4obobb lb— i4obo/ lob 


Iduronate 2-sulfatase 


b 


1 D A Y 1 
InAK I 


cnrA. I bozobo I /— I bozoboo/ 


lnterleukin-1 receptor-associated kinase 1 


DKA 
rIVI 


l\D I DUD 


cnr io.4 1 /ubozy— 4 i /u/oyy 


Kelch repeat and BTB (POZ) domain containing 6 


1 1 

u 


IVI Ab t Ao 


cnrA. lb lyjo I b4— I b lyooobb 


Melanoma antigen fami y A, 3 


1 1 

U 


IVIAb CAD 


cnrA. id i oo / lob— i d i od / / ud 


Melanoma antigen fami y A, 6 


1 1 
U 


APFAQ 
IVIAb tAy 


cnrA. 1 40 / yo4u i— 1 4o / y jddcs 


Melanoma antigen family A, 9 


1 1 

u 


IVIAb EU4B 


^hrY - R1 Q9Q9r~lQ £ 1 C19Q99Q 

cnrA.D i yzozuy— d i yzyzzo 


Melanoma antigen fami y D, 4B 


1 1 

u 


l\ /ITPD1 

M I Ur I 


cnrA. i D4zyy4 iu— i b4zyyb i z 


MatureTce l proliferation 1 


b 


l\ ATKA 1 
IVI I IVI I 


ohrY- 1/1Q7Q7Q/1Q 1/ID7Q7Q1Q 

cnrA. i4y /o / J4o— i4y /o /y i o 


Myotubularin 1 


b 


l\ /ITI\ /IRQ 

IVI I IVI no 


cnrA. bob I4yb4— bob I bbz4 


Myotubularin-related protein 8 


i i 

u 


MUC 

IN no 


^KrY' 17QOQ/1D1 17QQQQCQ 

cnrA. i / jyo4o i— i /oyjyby 


Nance— Horan syndrome (congenital cataracts and denta anomalies) 


M 
IM 


0RC1L 


chrl :52869831-52870401 


Origin recognition complex, subunit 1 


N 


CDK16 


chrX:47078470-47079428 


Cyclin-dependent kinase 16 


C 


PDZD4 


chrX: 1 53095693-1 53096406 


PDZ domain containing 4 


C 


PHF16 


chrX:46772444-46773014 


PHD finger protein 16 


N 


PRKX 


chrX:3631431-3632001 


Protein kinase, X-linked 


C 



(Continued) 
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Table 2 | Continued 



Gene 


Chr/base pair (bp) a 


Description/function 


Localization 11 


PRPF38A 


chr1:52869831-52870401 


PRP38 pre-mRNA processing factor 38 (yeast) domain containing A 


N 


RIBC1 


chrX:53449681-53450600 


RIB43A domain with coiled-coils 1 


U 


RNF128 


chrX: 1 05970276-1 05970478 


Ring finger protein 128 


C 


SCLY 


chr2:238969783-238970252 


Selenocysteine lyase 


c 


SHROOM4 


chrX:50557007-50557209 


Shroom family member 4 


PM 


SLC10A3 


chrX:1 53718280-153718749 


Solute carrier family 10 (sodium/bile acid cotransporter family), member 3 


PM 


SLC9A6 


chrX: 1 35067977-1 35068547 


Solute carrier family 9 (sodium/hydrogen exchanger), member 6 


PM 


SLITRK2 


chrX:144903417-144903908 


SLIT and NTRK-like family, member 2 


U 


SLITRK4 


chrX:142722571-142723141 


SLIT and NTRK-like family, member 4 


u 


SMARCA1 


chrX: 1 28657308-1 28657936 


SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin, 
sub-family A, member 1 


N 


SSR4 


chrX:1 530601 91-1 53060761 


Signal sequence receptor, delta (translocon-associated protein delta) 


C 


TAF9B 


chrX:77394695-77395265 


TAF9B RNA polymerase II, TATA box-binding protein-associated factor, 31 kDa 


N 


TCEAL6 


chrX: 1 01 397 1 22-1 01 397692 


Transcription elongation factor A (Sll)-like 6 


U 


TUSC3 


chr8: 1 5397909-1 5398479 


Tumor suppressor candidate 3 


PM 


UBL4A 


ch rX: 1 537 1 4886-1 537 1 5456 


Ubiquitin-like 4° 


U 


VCX2, VCX3A 


chrX:6451316-6452154 


Variable charge, X-linked 2, X-linked 3A 


U, N 


YIPF6 


chrX:67718891-67718965 


Yip 1 domain family, member 6 


C 


ZIC3 


chrX: 1 36649002-1 3664991 0 


Zic family member 3 (odd-paired homolog, Drosophila) 


N 


ZNF182 


chrX:4786291 1-47863428 


Zinc finger protein 182 


N 



Of note, all regions were hypermethylated in the PBC proband and only SMARCA i was differentially expressed in RT-PCR. 
a Positions of each gene based on GRCh37/hgl9. 

"For each gene product the localization is specified as nuclear (N), cytoplasmic (C), plasma membrane (PM), extracellular (E), unknown (II). 



and diseases were inflammatory, immunological, and connective 
tissue disorders. Furthermore the top canonical pathways involved 
were: T helper cell differentiation (p = 3.98E— 19), dendritic cell 
maturation (p = 1.39E— 13), altered T and B cell signaling in 
rheumatoid arthritis (p = 1.02E— 12), type I diabetes mellitus sig- 
naling (p= 1.04E— 11), and the crosstalk between dendritic cells 
and natural killer cells (p = 5.98E-ll) (Table4). 

DISCUSSION 

Primary biliary cirrhosis is considered a prototypic autoimmune 
disease because of the clinical homogeneity between patients and 
the relative consistency in natural history and pathology. Although 
relatively uncommon, several independent GWAS (7-13) have 
identified associations with transcription factors that further sug- 
gest a potential role for epigenetic shifts and thus our approach 
using this unique collection of DNA is a particularly impor- 
tant resource. We are aware of the numerous limitations of our 
study and that the observed changes in GEX may be stochas- 
tic rather than secondary to disease progression or involved in 
pathways involved in PBC pathogenesis, as suggested for other 



autoimmune diseases (29-32). The latter includes the possibility 
of portal hypertension and resulting leukopenia. 

We identified 60 DMR and 10 CNV between discordant MZ 
twins with 14 (20%) also differently expressed between PBC cases 
and control sisters, thus being stronger candidates as PBC bio- 
markers or determinants. One of the strengths of our study is the 
confirmation of identified genes by quantitative PCR and that this 
approach was extended also to genes identified in recent GWAS 
allowing identification of six genes differently expressed in PBC 
mononuclear cells. First, these genes support a down-modulation 
of Th2-cytokines such as IFIT1, an interferon type I signature 
represented by IFI44L, in favor of a fibrogenic phenotype as repre- 
sented by the IL6 up-regulation (33). Regarding this last observa- 
tion, we note the apparent discrepancy between DNA methylation 
and GEX of IL6 but we recognize that methylation does not 
fully correlate with GEX, and the difference could be explained 
by different mechanisms such as allele-specific methylation (34, 
35) (Table 4). Second, a single DMR-associated gene, i.e., hyper- 
methylated SMARCA1, manifested a reduced GEX confirmed in 
our RT-PCR study of sibling pairs. SMARCA1 is a transcription 
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regulator that modulates the chromatin structure and is involved 
in apoptosis, DNA damage, and differentiation. Moreover, the gene 
encodes for a member of the SWI/SNF family of proteins, which 
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are master regulators of GEX, regulating expression among oth- 
ers FOS, CSF-1, CRYAB, MIM-1, p21 (also known as CDKN1A), 
HSP70, VIM, and CCNA2; SWI/SNF has also been reported to 
modulate alternative splicing (36). Third, 5/7 sibling pairs had con- 
sistent dysregulation of CXCR5 being down-regulated in PBC lym- 
phocytes, which may reflect a compartmentalization of CXCR5+ 
cells within the liver or may reflect the chronic activation of B cells, 
as reported in rheumatoid arthritis (37). In fact, the chemokine 
receptor CXCR5 is expressed by B and T cells and controls their 
migration within lymph nodes while its ligand BCA-1/CXCL13 is 
present in lymph nodes and spleen and also in the liver. A down- 
regulation of CXRC5 is correlated with an increased production 
of IL-2, which may cause the production of immunoglobulins by 
B cells; IL-2 is normally produced by T cells during an immune 
response. IL-2 is also necessary during T cell differentiation in Treg, 
which are involved in self antigens recognition, which could result 
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FIGURE 2 | Expression fold changes at RT-PCR of the six genes found 
differences in PBC versus healthy sibling pairs analysis subjects. 



Table 3 | Genes showing consistent differences in DNA methylation, CNV, or expression 3 . 



Analysis 


ID 


Statu s b 


Entrez gene name 


Chr/base pair (bp) 


Localization 0 


GWAS 


CXCR5 


Down-regulated in 
three sibling pairs 0.44 


Chemokine (C-X-C motif) receptor 5 


chr11: 118764101-118766980 


PM 


GWAS 


HLA-B 


Down-regulated in 
three sibling pairs 0.14 


Major histocompatibility complex, class I, B 


chr6: 31321649-31324989 


PM 


GEX 


IFI44L 


Down-regulated in four 
sibling pairs 0.32 


Interferon-induced protein 44-like 


chrl: 79086088-79111830 


U 


GEX 


IFIT1 


Down-regulated in 
three sibling pairs 0.26 


Interferon-induced protein with tetratricopeptide 
repeats 1 


chr10: 91152303-91166244 


c 


GWAS 


IL6 


Up-regulated in three 
sibling pairs 2.5 


Interleukin 6 (interferon, beta 2) 


chr7: 22766798-22771620 


E 


MeDIP 


SMARCA1 


Down-regulated in 
three sibling pairs 0.33 


SWI/SNF-related, matrix-associated, actin-dependent 
regulator of chromatin, sub-family a, member 1 


chrX: 128484989-128485617 


N 



'List of genes evaluated with RT-PCR. 

Each of the tested genes showed consistent differences in methylation (MeDIP), copy number variation (CNV), or gene expression (GEX) (in the current study) or 
were candidate genes from genome-wide association studies (GWAS) studies. 

b Status: Log(RQ) is the logarithm of fold change= , which identifies the expression ratio: a positive Log(RQ) implies that the gene is up-regulated. 
c For each gene product the localization is specified as nuclear (N), cytoplasmic (C), plasma membrane (PM), extracellular (E), unknown (U). 
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Table 4 | List of PBC-associated genes analyzed with ingenuity 
pathways analysis software 8.0 IPA. 

Genes IPA findings 

IL6 Up-regulation of human IL6 protein in serum is associated 

with human PBC 

IL4 Up-regulation of human IL4 mRNA in liver is associated with 

human PBC 

IL17A Up-regulation of human IL-17 (IL17A) mRNA in liver is 
associated with human PBC 

IL13 Up-regulation of human IL13 mRNA in liver is associated with 

human PBC 

IL12RB2 Mutant human IL12RB2 gene (SNP substitution mutation 
(rs3790567) is associated with human PBC 
(p-value = 2.76E-11) 

IL12 Mutant human IL12A gene [SNP substitution mutation, allelic 

variations: A/G (rs4679868)] is associated with human PBC 
Mutant human IL12A gene (SNP substitution mutation 
(rs574808) is associated with human PBC 
(p-value = 1.88E-13) 

HLA- Mutant human HLA-DQB1 gene (SNP substitution mutation 

DQB1 (rs9275312) is associated with human PBC 

Mutant human HLA-DQB1 gene (SNP substitution mutation 
(rs2856683) is associated with human PBC 
(p-value = 1.78E-19) 

Mutant human HLA-DQB1 gene (SNP substitution mutation 

(rs7775228) is associated with human PBC 

Mutant human HLA-DQB1 gene (SNP substitution mutation 

(rs9275390) is associated with human PBC 

Mutant human HLA-DQB1 gene (SNP substitution mutation 

(rs9357152) is associated with human PBC 

HLA- Mutant human HLA-DPB1 gene (SNP substitution mutation 

DPB1 (rs9277535) is associated with human PBC 

Mutant human HLA-DPB1 gene (SNP substitution mutation 

(rs2281389) is associated with human PBC 

Mutant human HLA-DPB1 gene (SNP substitution mutation 

(rs660895) is associated with human PBC 

Mutant human HLA-DPB1 gene (SNP substitution mutation 

(rs9277565) is associated with human PBC 

CTLA4 Mutant human CTLA4 gene is associated with human PBC 



Of these, IL6 was found differentially expressed by RT-PCR in discordant sisters. 

in autoimmunity (38). Of note, following B cell activation and dif- 
ferentiation into plasma cells and memory cells, CXCR5 becomes 
down-regulated while the same effect is induced in vitro following 
anti-CD40 stimulation (39) and CD40L methylation appears to 
be altered in PBC (40). Fourth, HLA-B is also down-regulated in 
PBC, similar to several types of cancer (41-43). 

The majority of the identified genes map on the X chromosome, 
in agreement with the female predominance of the disease, and 
are involved in many cellular pathways. Our group in a previous 



work assessed the expression of 125 genes with variable X inacti- 
vation status and found that two genes (CLIC2 and PIN4) were 
consistently down-regulated in PBC affected twin of discordant 
pairs (17). Three genes are differentially methylated in lympho- 
cytes of patients with PBC and systemic sclerosis (32) and may 
thus be representative of general autoimmunity or fibrosis devel- 
opment; these genes include MTM1 hypermethylated in PBC and 
in systemic sclerosis while SSR4 and IGH3G are hypomethylated 
in both diseases. Of note, a recent study reported the up-regulation 
of the X-linked costimulatory molecule CD40L (40) but our data 
failed to confirm such hypomethylation in our cohort. The CNV 
differences observed in our MZ twin set warrant some further 
observations as the de novo post-twinning CNV frequency was 
estimated to be as high as 5% on a per-individual basis or 10% per 
twinning event (21). While the impact of CNV on GEX can vary 
(44), it would be of great interest to obtain parental information to 
determine the origin and timing of CNV in the offspring. On the 
other hand, there are several limitations to our data. PBC is rela- 
tively uncommon and our DNA collection reflects a several-year 
worldwide search; it is nonetheless a limited dataset. In addition, 
there is only limited information available using PBMC. PBC is an 
organ-specific disease affecting small intrahepatic bile ducts and 
thus studies of the portal infiltrating lymphocytes will provide 
a more valuable resource as would a detailed and well-defined 
lymphoid cell populations. These comments notwithstanding, the 
data obtained are intriguing and consistent with our thesis that 
one explanation for discordant MZ twins is DNA changes on 
the critical genomic element involved in disease susceptibility 
and these observations should be recapitulated also in unre- 
lated pairs of patients and controls. With the increased interest 
in the balance between genetic susceptibility, it becomes criti- 
cal for research groups to combine resources and improve access 
to clinical material and data that permits more extensive stud- 
ies and the potential for more powerful statistical analysis and 
interpretation. 
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